Blob storage artifact-data keys, binary sanitizer, and observability stripping by mike-inkeep · Pull Request #2744 · inkeep/agents

mike-inkeep · 2026-03-18T14:48:33Z

Summary

Adds an artifact binary sanitizer that offloads inline binary data (images, files) embedded in artifact payloads to blob storage, and strips those payloads from observability spans before they reach the tracing backend. Also introduces a dedicated artifact-data storage key category, fixes Vercel file part mapping to use url instead of text, and adds inlineExternalPdfUrlParts to support downloading external PDF URLs as inline bytes.

Key decisions

Sanitizer runs at persist time, not at generation time. Artifact data is sanitized when it is written to storage, so any consumer reading back blob URIs gets a stable reference. An alternative (sanitizing at read time) was not chosen because it would require every read path to do async work.

Content-hash-based artifact-data storage keys. The key structure is v1/t_{tenantId}/artifact-data/p_{projectId}/a_{artifactId}/sha256-{hash}.{ext}. Using a content hash means duplicate binary payloads within the same artifact are deduplicated automatically, and keys are deterministic without needing to generate UUIDs.

stripBinaryDataForObservability is synchronous and non-destructive. Rather than uploading to blob storage before writing spans (which would add latency and an async dependency to the span-writing hot path), binary parts are replaced with a [binary data ~N bytes, mimeType: X] placeholder. This keeps span sizes small without blocking.

inlineExternalPdfUrlParts fetches PDFs before the upload pipeline. Previously, external PDF URIs in file parts were explicitly blocked. This helper downloads them early (in buildPersistedMessageContent) and converts to inline bytes so the rest of the upload pipeline can process them uniformly.

Manual QA

No manual QA performed. Consider running /qa to generate and execute a test plan.

…l sites

…ility logs

changeset-bot · 2026-03-18T14:48:40Z

⚠️ No Changeset found

Latest commit: 7dd41d8

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

vercel · 2026-03-18T14:48:56Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Mar 26, 2026 2:01am
agents-manage-ui	Ready	Preview, Comment	Mar 26, 2026 2:01am

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
agents-docs	Skipped		Mar 26, 2026 2:01am

claude

PR Review Summary

(6) Total Issues | Risk: Medium

🟠⚠️ Major (2) 🟠⚠️

🟠 1) artifact-binary-sanitizer.ts:53 Missing error handling for storage upload failures

Issue: The uploadInlinePart function calls storage.upload() without a try-catch block. If the blob storage upload fails (network error, quota exceeded, permission denied), the promise rejection propagates and could fail the entire artifact processing pipeline.

Why: The existing pattern in image-upload.ts (lines 90-98) gracefully catches upload failures and drops the problematic part while preserving the rest of the content. This inconsistency creates a reliability gap where a transient storage failure could cause complete artifact processing failure rather than graceful degradation.

Fix: Add error handling around the upload call that either (1) returns the original part with base64 data intact as a fallback, or (2) replaces the data with an error placeholder. The first option preserves data at the cost of larger observability payloads; the second maintains observability cleanliness but loses the data.

Refs:

image-upload.ts:90-98 — existing error handling pattern

🟠 2) sanitizeArtifactBinaryData Exported function is not used in production code

Issue: The PR adds two functions: stripBinaryDataForObservability (used in 4 places in AgentSession.ts) and sanitizeArtifactBinaryData (an async function that uploads binary data to blob storage). However, sanitizeArtifactBinaryData is exported and tested but never imported or called anywhere in the codebase outside its test file.

Why: This means the artifact-data storage key category is added but no code path actually uploads artifacts using it. The infrastructure exists but is not wired up to any data flow. This may be intentional preparatory work, but it's unclear from the PR.

Fix: Either (1) defer adding sanitizeArtifactBinaryData until the consuming code is ready, (2) document in the PR description that this is preparatory infrastructure with follow-up work planned, or (3) wire up the function to the appropriate data flow (e.g., artifact persistence in ArtifactService).

Refs:

artifact-binary-sanitizer.ts:58 — unused function

Inline Comments:

🟠 Major: artifact-binary-sanitizer.ts:53 Missing error handling for storage upload failures

🟡 Minor (4) 🟡

🟡 1) storage-keys.ts:37-45 Missing unit tests for new artifact-data storage key category

Issue: The existing storage-keys.test.ts tests buildStorageKey for the media category but was not updated to test the new artifact-data category.
Why: The storage key format is critical for data integrity and multi-tenant isolation. A regression in the key format could cause lookup failures or tenant data leakage.
Fix: Add a test case in storage-keys.test.ts:

it('builds versioned artifact-data storage key', () => {
  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: 'tenant-1',
    projectId: 'project-1',
    artifactId: 'artifact-1',
    contentHash: 'abc123',
    ext: 'png',
  });
  expect(key).toBe('v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png');
});

🟡 2) artifact-binary-sanitizer.ts:27 Magic number threshold for binary detection

Issue: The 100 character threshold for detecting binary data is hardcoded without explanation.
Why: This represents only ~75 decoded bytes. Without documentation, it's unclear if this was deliberately chosen or if a larger threshold would be more appropriate.
Fix: Extract to a named constant: const BINARY_DATA_MIN_LENGTH = 100; // ~75 decoded bytes

🟡 3) artifact-binary-sanitizer.test.ts Missing test for data: URI handling

Issue: The implementation checks !v.data.startsWith('data:') but there's no explicit test.
Why: Data URIs are common for inline images; without a test, this check could be accidentally removed.
Fix: Add test case for data URIs.

🟡 4) artifact-binary-sanitizer.test.ts Missing test for upload failure scenario

Issue: Tests mock upload to always succeed but never test the failure path.
Why: Masks the missing error handling in the implementation.
Fix: Add failure test case.

Inline Comments:

🟡 Minor: artifact-binary-sanitizer.ts:19 Magic number threshold lacks documentation
🟡 Minor: artifact-binary-sanitizer.test.ts:109 Missing test for upload failure handling
🟡 Minor: artifact-binary-sanitizer.test.ts:57 Missing test for data: URI handling

💭 Consider (2) 💭

💭 1) storage-keys.ts Missing companion parse/prefix functions for artifact-data category

Issue: The media category has parseMediaStorageKey and buildMediaStorageKeyPrefix helpers, but artifact-data only has the buildStorageKey case.
Why: May be needed later for cleanup, authorization checks, or debugging. If not needed, this is acceptable.
Fix: Add a comment explaining why parse/prefix helpers are not needed, or add them for consistency.

💭 2) AgentSession.ts Pretty-print JSON in span attributes

Issue: Uses JSON.stringify(..., null, 2) for span attributes which adds unnecessary whitespace.
Why: Span attributes are stored and indexed; compact format reduces storage overhead.
Fix: Remove null, 2 arguments: JSON.stringify(stripBinaryDataForObservability(artifactData.data))

💡 APPROVE WITH SUGGESTIONS

Summary: This PR adds valuable infrastructure for handling binary data in artifacts. The stripBinaryDataForObservability function is correctly integrated and will improve observability by preventing large binary payloads in spans. However, two issues warrant attention: (1) the upload function lacks error handling unlike the existing image-upload.ts pattern, and (2) sanitizeArtifactBinaryData appears to be unused — clarification on whether this is intentional preparatory work would be helpful. Adding tests for the new storage key category would also strengthen the PR.

Discarded (3)

Location	Issue	Reason Discarded
`artifact-binary-sanitizer.test.ts`	Missing test for missing mimeType fallback	Low severity (INFO level); the behavior is implicitly covered and mimeType presence is common
`artifact-binary-sanitizer.test.ts`	Missing boundary test at exactly 100 chars	Nice-to-have but unlikely to cause real bugs; current tests cover above/below threshold adequately
`AgentSession.ts` spans	Redundant `JSON.stringify` pretty-print	Moved to Consider as it's a minor optimization, not a correctness issue

Reviewers (6)

Reviewer	Returned	Main Findings	Consider	Inline Comments	Discarded
`pr-review-precision`	3	1	1	0	1
`pr-review-tests`	6	2	0	2	2
`pr-review-sre`	4	1	1	1	1
`pr-review-consistency`	2	1	0	0	1
`pr-review-standards`	0	0	0	0	0
`pr-review-types`	0	0	0	0	0
Total	15	5	2	3	5

Note: Some findings were merged across reviewers (e.g., error handling raised by both tests and SRE reviewers).

claude · 2026-03-18T14:53:03Z

agents-api/src/domains/run/services/blob-storage/artifact-binary-sanitizer.ts

+    ext,
+  });
+
+  await storage.upload({ key, data: buffer, contentType: mimeType });


🟠 MAJOR: Missing error handling for storage upload failures

Issue: The uploadInlinePart function calls storage.upload() without a try-catch. If the upload fails (network error, quota exceeded, permission denied), the error propagates and could fail the entire artifact processing pipeline.

Why: The existing pattern in image-upload.ts (lines 90-98) catches upload failures gracefully. This inconsistency could cause unhandled errors and partial artifact state when storage is temporarily unavailable.

Fix: Consider wrapping the upload in a try-catch:

try { await storage.upload({ key, data: buffer, contentType: mimeType }); return { ...part, data: toBlobUri(key) }; } catch (error) { // Option 1: Return original part (fallback to base64) return part; // Option 2: Return placeholder // return { ...part, data: `[upload failed, binary ~${buffer.length} bytes]` }; }

Refs:

image-upload.ts:90-98 — existing error handling pattern

claude · 2026-03-18T14:53:07Z

agents-api/src/domains/run/services/blob-storage/artifact-binary-sanitizer.ts

+  [key: string]: unknown;
+};
+
+const CIRCULAR_REFERENCE_PLACEHOLDER = '[Circular Reference]';


🟡 Minor: Magic number threshold lacks documentation

Issue: The 100 character threshold for detecting binary data is hardcoded without explanation. This represents only ~75 decoded bytes, which may be smaller than intended.

Why: Without documentation, it's unclear whether this threshold was deliberately chosen or if a larger value (e.g., 1KB) would better distinguish binary content from short encoded strings.

Fix:

Suggested change

const CIRCULAR_REFERENCE_PLACEHOLDER = '[Circular Reference]';

const BINARY_DATA_MIN_LENGTH = 100; // ~75 decoded bytes; data below this is likely metadata, not binary payload

Then use the constant in the check below.

claude · 2026-03-18T14:53:12Z

agents-api/src/domains/run/services/__tests__/artifact-binary-sanitizer.test.ts

+      download: vi.fn(),
+      delete: vi.fn(),
+    });
+  });


🟡 Minor: Missing test for upload failure handling

Issue: The test suite mocks upload to always succeed but never tests the failure path.

Why: Given that image-upload.test.ts has a test for 'drops file part when storage.upload throws' (line 148), this gap masks the missing error handling in the implementation.

Fix: Add a test case:

it('propagates upload errors when blob storage fails', async () => { mockUpload.mockRejectedValueOnce(new Error('Storage quota exceeded')); const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' }; await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('Storage quota exceeded'); });

Refs:

image-upload.test.ts:148 — existing error handling test

claude · 2026-03-18T14:53:16Z

agents-api/src/domains/run/services/__tests__/artifact-binary-sanitizer.test.ts

+  it('leaves http URLs untouched', () => {
+    const input = { type: 'image', data: 'https://example.com/img.png', mimeType: 'image/png' };
+    const result = stripBinaryDataForObservability(input) as any;
+    expect(result.data).toBe('https://example.com/img.png');


🟡 Minor: Missing test for data: URI handling

Issue: Line 30 of the implementation checks !v.data.startsWith('data:') but there's no explicit test verifying data URIs are left untouched.

Why: Data URIs are a common format for inline images. Without a test, a refactor could accidentally remove the check, causing data URIs to be re-uploaded unnecessarily.

Fix: Add test cases:

it('leaves data: URIs untouched', () => { const dataUri = 'data:image/png;base64,' + LARGE_BASE64; const input = { type: 'image', data: dataUri, mimeType: 'image/png' }; const result = stripBinaryDataForObservability(input) as any; expect(result.data).toBe(dataUri); });

…artifact_binary_sanitizer

…_sanitizer

pullfrog · 2026-03-25T23:04:42Z

TL;DR — Adds infrastructure for offloading inline binary data (images, files) from artifact payloads to blob storage, and strips that binary data from observability spans so traces and logs don't balloon with base64 content.

Key changes

Add artifact-data category to buildStorageKey — extends the storage-key builder with a new ArtifactDataKeyInput discriminant that produces content-addressed keys scoped to tenant, project, and artifact.
New artifact-binary-sanitizer module — sanitizeArtifactBinaryData recursively walks arbitrary artifact data, detects inline base64 image/file parts, uploads them to blob storage, and replaces the data field with a blob:// URI. stripBinaryDataForObservability does the same walk but swaps data with a human-readable placeholder instead of uploading.
Handle data: URIs and graceful upload failures — the sanitizer parses data: URI strings to extract the embedded mime type and base64 payload before upload, and catches storage errors to return the original inline data rather than throwing.
Strip binary data from AgentSession observability attributes — four call sites in AgentSession.ts that serialize artifactData.data into span attributes now wrap the value in stripBinaryDataForObservability to keep traces lightweight.
Export parseDataUri from message-parts.ts — promotes an existing helper to a shared export so the binary sanitizer can reuse it.
Comprehensive tests for both sanitizer functions — covers upload-and-replace, data: URI extraction, idempotent blob URIs, graceful upload failure fallback, recursive nested structures, circular-reference safety, and observability placeholder formatting.

_{Summary ｜ 6 files ｜ 12 commits ｜ base: main ← stack/artifact_binary_sanitizer}

Content-addressed blob storage keys for artifact data

Before: buildStorageKey only supported conversation-media keys scoped to conversation/message.
After: A new artifact-data category produces keys like v1/t_{tenant}/artifact-data/p_{project}/a_{artifact}/sha256-{hash}.{ext}, enabling content-addressed deduplication per artifact.

The ArtifactDataKeyInput type is added as a union member to StorageKeyInput, and the switch in buildStorageKey gains a new case.

storage-keys.ts

Binary sanitizer for artifact data

Before: Inline base64 blobs in artifact data were stored and transmitted as-is — no offloading, no size reduction.
After: sanitizeArtifactBinaryData detects inline image/file parts (any data string that isn't a blob:// or http URL), uploads them to blob storage via content hash, and replaces data with a blob:// URI. Already-uploaded or URL-based data is left untouched.

The detection heuristic (isInlineBinaryPart) checks for type: 'image' | 'file', a non-empty data string, and excludes blob:// and http URLs — there is no minimum-length threshold. data: URIs are parsed via parseDataUri to extract the embedded mime type and raw base64 payload before hashing and upload. Both the upload sanitizer and the observability stripper share this heuristic and use a WeakSet-based cycle guard for circular references.

Upload failures are caught and logged with structured context (tenant, project, artifact, mime type, size) — the original inline data is returned unchanged, keeping the pipeline resilient to storage outages.

How does the observability stripper differ from the upload sanitizer?

stripBinaryDataForObservability is synchronous and never calls blob storage — it replaces detected binary data with a placeholder string like [binary data ~150 bytes, mimeType: image/png]. This keeps span attributes and log payloads small without requiring any I/O.

artifact-binary-sanitizer.ts · artifact-binary-sanitizer.test.ts · message-parts.ts

Observability stripping in `AgentSession`

Before: artifactData.data was serialized directly into span attributes via JSON.stringify with pretty-printing, embedding potentially megabytes of base64 into every trace.
After: All four serialization sites wrap the value in stripBinaryDataForObservability before stringifying, and drop the pretty-printing (null, 2) since span attributes don't benefit from it.

The affected call sites are the artifact receipt span, the artifact data preview for context truncation, and both artifact summary spans in the tool-result handling paths.

AgentSession.ts

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ pullfrog.com ｜ 𝕏}

claude

PR Review Summary

(6) Total Issues | Risk: Medium

Delta Since Last Review

The only change since the prior review is a single import path fix:

-import { getExtensionFromMimeType } from '@inkeep/agents-core/constants/allowed-image-formats';
+import { getExtensionFromMimeType } from '@inkeep/agents-core/constants/allowed-file-formats';

This is a correct fix — the module was renamed from allowed-image-formats to allowed-file-formats in main. ✅

🕐 Pending Recommendations (6)

The following issues were raised in the prior review and remain unaddressed:

Major

🟠 artifact-binary-sanitizer.ts:53 Missing error handling for storage upload failures — uploadInlinePart lacks try-catch around storage.upload(), inconsistent with file-upload.ts pattern
🟠 sanitizeArtifactBinaryData Exported function is unused in production code — infrastructure exists but not wired to any data flow

Minor

🟡 storage-keys.ts:37-45 Missing unit tests for new artifact-data storage key category in storage-keys.test.ts
🟡 artifact-binary-sanitizer.ts:27 Magic number threshold (100) for binary detection lacks documentation
🟡 artifact-binary-sanitizer.test.ts Missing test for data: URI handling
🟡 artifact-binary-sanitizer.test.ts Missing test for upload failure scenario

🚫 REQUEST CHANGES

Summary: The delta since the prior review is a trivial import path fix which is correct. However, the 2 Major and 4 Minor issues identified in the previous review remain unaddressed. The most critical gaps are: (1) missing error handling for storage upload failures which creates inconsistency with the existing file-upload.ts pattern and could cause complete artifact processing failure on transient storage errors, and (2) sanitizeArtifactBinaryData being exported but unused, suggesting incomplete integration or preparatory infrastructure that should be documented.

Reviewers (0)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
N/A — delta review	0	0	0	0	0	6	0

Note: Domain-specific reviewers were not dispatched for this re-review because the delta consists only of a single import path fix. All 6 pending recommendations are carried forward from the prior review.

itoqa · 2026-03-26T00:33:30Z

Ito Test Report ❌

17 test cases ran. 1 failed, 16 passed.

Overall, 17 test cases were executed with 16 passing and 1 failing, showing strong stability across playground chat submission/double-send gating, traces conversation/history routing, outage and navigation resilience, artifact sanitization/copy-download behavior, mobile rendering, session invalidation, and authorization/large-payload protections. The key issue is a Medium-severity traces UI bug where untrusted trace text is rendered via markdown (Streamdown) in timeline/detail views, causing script-tag rendering instability instead of consistently inert plain-text handling and reducing observability reliability.

❌ Failed (1)

Category	Summary	Screenshot
Adversarial	🟠 Traces UI renders untrusted trace text via markdown, producing script-tag rendering instability instead of consistently inert text handling.

🟠 XSS payload in artifact fields is not executed

What failed: Untrusted trace text is rendered through Streamdown markdown components, and script-tag payloads cause rendering errors/instability instead of being handled as inert plain text.
Impact: Malformed or adversarial trace content can destabilize traces rendering and degrade operator visibility during debugging. This weakens reliability of the observability UI for affected conversations.
Introduced by this PR: No – pre-existing bug (code not changed in this PR)
Steps to reproduce:
1. Open a traces conversation containing tool/message fields with <script>-style payload text.
2. Navigate the timeline rows and message detail panel that render trace text fields.
3. Observe rendering instability/errors instead of strictly inert plain-text output.
Code analysis: I inspected traces activity shaping in the conversation API route and the UI timeline/detail renderers. The route builds descriptions from span attributes (including tool names), and UI components pass those strings directly into Streamdown without explicit sanitization or plain-text escaping on these paths.
Why this is likely a bug: Production code combines span-derived text with markdown rendering in the traces UI, creating a plausible and repeatable unsafe-rendering path for adversarial strings.

Relevant code:

agents-manage-ui/src/components/traces/timeline/timeline-item.tsx (lines 260-268)

<button
  type="button"
  onClick={onSelect}
  className={`flex items-center gap-1 group cursor-pointer transition-colors duration-200 ${textColorClass}`}
  title="Click to view details"
>
  <span className="font-medium">
    <Streamdown>{activity.description}</Streamdown>
  </span>

agents-manage-ui/src/components/traces/timeline/render-panel-content.tsx (lines 262-265)

<LabeledBlock label="Message content">
  <Bubble className="break-words">
    <Streamdown>{a.messageContent || 'Message content not available'}</Streamdown>
  </Bubble>
</LabeledBlock>

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts (lines 765-771)

const toolCall = getString(span, SPAN_KEYS.SPAN_ID, '');
activities.push({
  id: toolCall,
  type: ACTIVITY_TYPES.TOOL_CALL,
  toolName: name,
  description: hasError && statusMessage ? `Tool ${name} failed` : `Called ${name}`,
  timestamp: span.timestamp,

✅ Passed (16)

Category	Summary	Screenshot
Adversarial	Tampered deep-link requests remain tenant/project scoped; no cross-scope payload exposure was verified.
Adversarial	Large inline binary-like artifact data is sanitized/stripped, with oversized retrieval warnings preserved for traces UI.
Adversarial	Cross-tab logout invalidated the other tab session and reloading redirected to login as expected.
Edge	Rapid 5x send attempts were gated to one submission and the chat UI remained stable.
Edge	Mobile traces detail flow at 390x844 remained readable and interactive with no horizontal overflow.
Edge	Short binary-like artifact data remains visible without over-redaction.
Edge	Circular placeholder text stays inert and export flows keep valid JSON.
Edge	Oversized artifact warning and metadata remain visible and accurate.
Edge	Rapid side-panel open/close interactions remain stable and keep artifact details synchronized.
Happy-path	Playground prompt submission completed and assistant output rendered successfully.
Happy-path	Trace conversation route includes artifact_processing query/mapping and handles SigNoz connectivity errors gracefully; no product logic defect found.
Happy-path	Deep-linked traces conversation detail handled forced backend outage with a stable error state across reloads and no crash.
Happy-path	Conversation history route behavior remains coherent; route-level integration tests pass for artifact-heavy message shaping.
Screen	Artifact processing timeline card renders sanitized artifact data with binary placeholder in local UI fixture flow.
Screen	Artifact details panel copy/download actions export sanitized JSON without raw binary leakage.
Screen	Non-linear traces navigation preserved conversation context via URL-scoped conversation routes and rehydration on refresh/deep-link flows.

Commit: 4c9a209

View Full Run

Tell us how we did: Give Ito Feedback

This function is intentionally exported for use in artifact persistence (see PR #2745). The observability-only stripping path is `stripBinaryDataForObservability`. export async function sanitizeArtifactBinaryData()

claude

PR Review Summary

(0) Total Issues | Risk: Low

Delta Since Last Review

The author addressed all 6 prior findings from the previous review:

Prior Finding	Status	Implementation
🟠 Missing error handling for upload	✅ FIXED	Try-catch added with graceful fallback (lines 58-73)
🟠 Unused `sanitizeArtifactBinaryData`	✅ Understood	Intentional preparatory infrastructure
🟡 Missing `artifact-data` storage key test	✅ FIXED	Test added (storage-keys.test.ts:61-72)
🟡 Magic number (100) threshold	✅ RESOLVED	Changed to `> 1` — now detects all binary data
🟡 Missing `data:` URI test	✅ FIXED	Tests added for both stripping (lines 53-61) and sanitization (lines 193-206)
🟡 Missing upload failure test	✅ FIXED	Test added (lines 208-214)

New Changes in Delta

The delta commit (f5f09e008) adds:

parseDataUri integration — Data URIs are now parsed and their base64 payloads extracted (mimeType from URI takes precedence)
Robust error handling — Upload failures log comprehensive context and gracefully fall back to original inline data
Pretty-print removal — Removed null, 2 from JSON.stringify calls in span attributes (compact format)
Comprehensive test coverage — All prior gaps addressed with well-structured tests

💭 Consider (1) 💭

Inline Comments:

💭 Consider: artifact-binary-sanitizer.test.ts:22 Missing boundary test for threshold change from > 100 to > 1

🕐 Pending Recommendations (1)

🟠 sanitizeArtifactBinaryData Function is exported and tested but unused in production — understood as preparatory infrastructure for a follow-up PR that wires it into artifact persistence

✅ APPROVE

Summary: Excellent follow-up work. All prior feedback has been addressed with well-implemented solutions. The error handling pattern (try-catch with graceful fallback to original data) is consistent with existing patterns in the codebase. The sanitizeArtifactBinaryData function being unused is understood as preparatory infrastructure — the observability stripping via stripBinaryDataForObservability is the production-ready piece that's correctly integrated into AgentSession.ts. Ship it! 🚀

Discarded (2)

Location	Issue	Reason Discarded
`artifact-binary-sanitizer.ts:45`	`Buffer.from` doesn't validate base64 input	Edge case — image/file parts from LLM tool results should be valid base64; any corruption would be visible when artifact fails to render
`artifact-binary-sanitizer.test.ts:193`	Missing test for malformed data URIs	Low priority edge case — `parseDataUri` returns null, code safely falls back to treating entire string as base64

Reviewers (3)

Reviewer	Returned	Inline Comments	Discarded
`pr-review-standards`	0	0	0
`pr-review-tests`	2	1	1
`pr-review-errors`	3	0	1
Total	5	1	2

Note: Most reviewer findings were INFO-level acknowledgments of correctly implemented prior feedback.

claude · 2026-03-26T01:47:49Z

agents-api/src/domains/run/services/__tests__/artifact-binary-sanitizer.test.ts

+  ),
+}));
+
+const SAMPLE_BASE64 = Buffer.from('sample image bytes').toString('base64');


💭 Consider: Missing boundary test for threshold change

Issue: The threshold changed from > 100 to > 1, but no test verifies this boundary. SAMPLE_BASE64 (~24 chars) passes both thresholds.

Why: Without a boundary test, a future refactor could accidentally change > 1 to >= 1 or back to > 100 without test failure.

Fix: Consider adding boundary tests:

it('treats 2-char data string as binary part', () => { const input = { type: 'image', data: 'AA', mimeType: 'image/png' }; const result = stripBinaryDataForObservability(input) as any; expect(result.data).toMatch(/^\[binary data/); }); it('does not treat 1-char data string as binary part', () => { const input = { type: 'image', data: 'A', mimeType: 'image/png' }; const result = stripBinaryDataForObservability(input) as any; expect(result.data).toBe('A'); });

claude

PR Review Summary

(0) Total Issues | Risk: Low

Delta Since Last Review

The only change since the prior review is commit 7dd41d83b ("Remove unnecessary check"):

-    v.data.length > 1 &&

This removes the minimum length check from isInlineBinaryPart, so now any non-empty string data (that isn't a blob:// or http URL) is treated as a binary part.

Analysis

Previous state: v.data.length > 1 — strings of 0 or 1 characters were excluded
New state: No length check — any non-empty string is included

Is this correct? ✅ Yes. The removed check was overly restrictive:

Single-character base64 data is valid (though rare in practice)
Empty strings are already implicitly excluded — if data is '', it would still match typeof v.data === 'string', but:
- stripBinaryDataForObservability would produce [binary data ~0 bytes] — harmless
- sanitizeArtifactBinaryData would upload a 0-byte file — also harmless (content-hash deduplication handles this)
The prior "Consider" item about missing boundary tests is now moot — the simpler heuristic doesn't need boundary testing

Impact: Simplifies the detection heuristic with no functional regression. Edge cases (empty/tiny strings) are handled safely.

🕐 Pending Recommendations (1)

🟠 sanitizeArtifactBinaryData Function is exported and tested but unused in production — understood as preparatory infrastructure for a follow-up PR

✅ APPROVE

Summary: Clean simplification. Removing the > 1 length check makes the binary detection heuristic more straightforward without introducing regressions. The prior review's APPROVE stands — all feedback has been addressed, and this final tweak is a sensible cleanup. 🚀

Reviewers (0)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
N/A — trivial delta review	0	0	0	0	0	1	0

Note: Domain-specific reviewers were not dispatched because the delta consists only of a single line removal. The pending recommendation is carried forward from prior reviews.

itoqa · 2026-03-26T03:26:06Z

Ito Test Report ✅

12 test cases ran. 12 passed.

Across the unified QA run, all 12 executed test cases passed with 0 failures, and verification found no confirmed or plausible production-code defects. Most importantly, run-route handling worked for inline and nested mixed-content inputs with successful HTTP 200 SSE streaming, external/failed PDF ingestion paths consistently returned controlled HTTP 400 responses with URL token/query/hash redaction and fallback behavior (no 5xx), and traces/artifact and UI stress checks (sanitized payload placeholders with preserved mimeType metadata, clean JSON downloads, mobile/back-forward/deep-link/filter abuse) remained stable and usable despite recoverable telemetry configuration errors.

✅ Passed (12)

Category	Summary	Screenshot
Adversarial	Verified that PDF URL ingestion errors sanitize source URLs by stripping query/hash values, preventing sensitive token leakage in error surfaces.	N/A
Adversarial	Traces UI remained usable through deep-link query fuzzing, rapid filter toggles, back/forward loops, and iPhone 12 viewport checks.
Edge	Nested mixed-content payload parsing and SSE streaming completed successfully in re-verification.
Edge	Mobile artifact panel usability remained intact for large scrubbed payload flows after re-investigation.
Edge	Back/forward navigation loops preserved conversation artifact context without losing the JSON panel state.
Logic	Re-verification confirmed inline base64/data URI payloads are replaced by binary-placeholder text before trace attributes are emitted.
Logic	Artifact trace JSON preserves `mimeType` context in sanitized placeholder output.
Logic	Downloaded artifact JSON remained sanitized and excluded raw binary/base64 signatures.
Logic	Mixed text plus failing remote file input produced a controlled fallback response without 5xx.
Logic	Persisted artifact data rendered the expected key shape (`placeholder`, `mimeType`, `sanitized`, `originalBytes`, `reason`) without raw payload leakage.
Happy-path	Inline PDF data URI request returned expected SSE streaming response (HTTP 200).
Happy-path	Unreachable external PDF URL returned a controlled HTTP 400 with sensitive token redaction.

Commit: 7dd41d8

View Full Run

Tell us how we did: Give Ito Feedback

tim-inkeep · 2026-03-26T18:48:12Z

@mike-inkeep does this only find binary data within artifacts that have been saved? what about the artifacts themselves being moved to blob storage?

mike-inkeep added 2 commits March 18, 2026 10:36

Return MessageSelect[] from history APIs and format at generation cal…

8eea5bf

…l sites

Add artifact binary sanitizer and strip binary payloads from observab…

c7043e0

…ility logs

vercel bot deployed to Preview – agents-manage-ui March 18, 2026 14:48 View deployment

claude bot reviewed Mar 18, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 18, 2026

This was referenced Mar 18, 2026

fix(agents-api): sanitize artifact binary data before persistence #2680

Closed

feat(images): enable multi-turn conversations about image context parts #2604

Closed

mike-inkeep added 6 commits March 23, 2026 19:07

Merge remote-tracking branch 'origin/main' into stack/history_shape

a96c949

Further review feedback and tightening of types

4e1416a

Merge remote-tracking branch 'origin/main' into stack/history_shape

37a910e

More cleanup

513ed69

Merge remote-tracking branch 'origin/main' into stack/history_shape

393705c

Merge remote-tracking branch 'origin/stack/history_shape' into stack/…

04ecfa7

…artifact_binary_sanitizer

Base automatically changed from stack/history_shape to main March 25, 2026 20:07

mike-inkeep added 2 commits March 25, 2026 16:30

Merge remote-tracking branch 'origin/main' into stack/artifact_binary…

12558e9

…_sanitizer

fix filepath

4c9a209

vercel bot deployed to Preview – agents-docs March 25, 2026 23:05 View deployment

vercel bot deployed to Preview – agents-manage-ui March 25, 2026 23:05 View deployment

vercel bot deployed to Preview – agents-api March 25, 2026 23:06 View deployment

claude bot reviewed Mar 25, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 25, 2026

PR feedback and refactor

f5f09e0

This function is intentionally exported for use in artifact persistence (see PR #2745). The observability-only stripping path is `stripBinaryDataForObservability`. export async function sanitizeArtifactBinaryData()

vercel bot temporarily deployed to Preview – agents-docs March 26, 2026 01:43 Inactive

vercel bot deployed to Preview – agents-manage-ui March 26, 2026 01:44 View deployment

vercel bot deployed to Preview – agents-api March 26, 2026 01:46 View deployment

claude bot reviewed Mar 26, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 26, 2026

Remove unnecessary check

7dd41d8

vercel bot temporarily deployed to Preview – agents-docs March 26, 2026 01:56 Inactive

vercel bot deployed to Preview – agents-manage-ui March 26, 2026 01:57 View deployment

claude bot reviewed Mar 26, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 26, 2026

vercel bot deployed to Preview – agents-api March 26, 2026 02:01 View deployment

mike-inkeep requested review from tim-inkeep March 26, 2026 18:00

	const CIRCULAR_REFERENCE_PLACEHOLDER = '[Circular Reference]';
	const BINARY_DATA_MIN_LENGTH = 100; // ~75 decoded bytes; data below this is likely metadata, not binary payload

Conversation

mike-inkeep commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Key decisions

Manual QA

Uh oh!

changeset-bot bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

vercel bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🟠⚠️ Major (2) 🟠⚠️

🟡 Minor (4) 🟡

💭 Consider (2) 💭

💡 APPROVE WITH SUGGESTIONS

Uh oh!

claude bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Key changes

Content-addressed blob storage keys for artifact data

Binary sanitizer for artifact data

Observability stripping in AgentSession

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

Delta Since Last Review

🕐 Pending Recommendations (6)

Major

Minor

🚫 REQUEST CHANGES

Uh oh!

itoqa bot commented Mar 26, 2026

Ito Test Report ❌

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

Delta Since Last Review

New Changes in Delta

💭 Consider (1) 💭

🕐 Pending Recommendations (1)

✅ APPROVE

Uh oh!

claude bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

Delta Since Last Review

Analysis

🕐 Pending Recommendations (1)

✅ APPROVE

Uh oh!

itoqa bot commented Mar 26, 2026

Ito Test Report ✅

Uh oh!

tim-inkeep commented Mar 26, 2026

Uh oh!

Reviewers

Assignees

mike-inkeep commented Mar 18, 2026 •

edited

Loading

changeset-bot bot commented Mar 18, 2026 •

edited

Loading

vercel bot commented Mar 18, 2026 •

edited

Loading

pullfrog bot commented Mar 25, 2026 •

edited

Loading

Observability stripping in `AgentSession`